FAULT DETECTION IN A PHYSICAL SYSTEM



The U.S. Government has a paid-up license in this invention and the right in 
limited circumstances to require the patent owner to license others on reasonable terms 
as provided for by the terms of Contract No. F33615-98-C-2890 awarded by the Air
Force Research Laboratory, Wright-Patterson AFB. 



BACKGROUND 

The present invention relates in general to the detection of system faults and more
particularly to the detection of anomalies in a physical system.

The maintenance and monitoring of physical systems, including complex systems like
aircraft engines, rocket propulsion systems, and aerospace vehicles, are important for the
prevention and detection of abnormal operating conditions. In particular, it is desired to
detect operating conditions of the physical system that correspond to unknown fault modes, or
simply anomalies.

Traditional approaches have not been effective in detecting certain types of faults or
failures, especially the detection of anomalies in complex systems. The detection of
anomalies is typically more difficult than the detection of known failure modes because the
failure mode has not been previously identified or categorized. Some prior failure detection
approaches are based on data-driven signal-processing that examines the statistical
characteristics of measured data streams obtained from a system. However, these types of
approaches are not well-suited to detecting anomalies of a system that experiences large
variations in operating variables and frequent mode switching, and have only provided limited
accuracy in detecting such anomalies. Further, these and other types of fault detection
approaches have required significant amounts of domain expertise or physical knowledge
about the system, thus increasing the cost and difficulty of detecting anomalies. Anomaly
failure detection by such approaches is further complicated in complex systems due to the
wide variation of operating conditions, especially when the system is not at steady-state.

Accordingly, there is a need for an improved way to detect anomalies in physical
systems that reduces the extent of knowledge required about the system, that can handle
failure modes that exceed the data parameter space collected about the prior operation of the
system, and that can readily handle anomaly detection in the complicated operational modes
observed in complex physical systems.

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 illustrates a failure detection system according to the present invention; 

FIG. 2 illustrates the general operational states of a physical system; 

FIG. 3 illustrates the inputs and outputs in a physical model;

FIGs. 4 and 5 are flow charts illustrating steps in a failure detection method according 
to the present invention; and 

FIG. 6 is a table illustrating an example of actual and derived variables for a gas- 
turbine engine system. 

DETAILED DESCRIPTION 

FIG. 1 illustrates a fault or failure detection system 100 according to the present
invention. System 100 is used to detect faults in a physical system 102, such as, for example, a
gas-turbine engine or an air vehicle. Sensors 104, 105, 106, and 108 are used to measure
operating conditions or variables of physical system 102. Examples of such conditions
include temperature, pressure, flow rate, and speed. A computer system 110 receives the
measured variables from sensors 104-108 and processes these measurements to detect a fault
as described in more detail below. A user interface 112 is coupled to computer system 110
and used to alert a user to a fault condition. Interface 112 may alternatively be an interface to
another machine or computer system (not shown) by which computer system 110 can initiate
an event or action in the other machine or computer system in response to a fault detection.

A storage medium 114, for example a computer hard drive or other non-volatile
memory storage unit, stores computer programs used to operate computer system 110
according to the method of the present invention as described below. A control system 116
provides control signals (indicated simply as "CONTROL SIGNAL") to control the operation
of physical system 102. Computer system 110 provides a FAULT signal to control system
116, which may be used to initiate a change in a control variable of physical system 102 if a
fault is detected.

FIG. 2 illustrates the general operational states of physical system 102, which are
graphically represented as regions 200, 202, 204, and 206 in a circle 201. Circle 201
represents all possible conditions of physical system 102. More specifically, regions 200 and
202 correspond to known operational states of physical system 102, where region 200
represents known normal states and region 202 represents known faults or failure modes.

Regions 204 and 206 correspond to unknown operational states of physical system
102, where region 206 represents unknown faults and region 204 represents unknown normal
states. The fault detection system and method according to the present invention is primarily
directed to detecting faults that fall within region 206. These unknown faults are generally
referred to herein as anomalies. Anomalies include both continuing and intermittent faults. It
should also be appreciated that the present invention is applicable to and useful for detecting
known faults.

Because anomalies correspond to unknown types of failures, they are generally the
most difficult type of fault to detect, in part because these types of failures are difficult to
model. As will be discussed further below, the present invention improves the ability to
detect anomalies to permit corrective action such as, for example, computer system 110
initiating a change in the CONTROL SIGNAL provided by control system 116 to physical
system 102, or providing an alert through interface 112 that leads to corrective maintenance
action during a scheduled down time for physical system 102.

FIG. 3 illustrates the inputs and outputs in a physical model 300 that is used to model
the physical behavior of physical system 102. According to the present invention as discussed
further below, physical model 300 is selected or developed for estimating expected output
variables y_estimated based on measured input variables x_i. The expected output variables are
considered to be dependent variables in physical model 300, and the measured input variables
x_i are considered to be independent variables.

Variables x_i correspond to measurements of actual physical conditions taken from
physical system 102 using, for example, sensors 104, 105, 106, and 108. It should be noted
that FIG. 1 is simplified, and in an actual complex system, there will typically be many
sensors or other types of measuring devices that can provide data representing variables x_i.
Some of these sensors provide independent variables for use in model 300, and others of these
sensors provide other measured variables that can be compared to dependent variables
calculated using the model.

Typically, physical model 300 is represented in a software program stored on storage
medium 114 and executed on computer system 110. An example of a simple physical model
is F = m*a, where F is force, m is the mass of an object, and a is the acceleration of a moving
object measured by a sensor such as an accelerometer. Another example of a physical model
is P = c*p*T, where P is pressure, c is a constant, p is the density of a gas, and T is
temperature. Variables y_estimated (for example, the pressure P_estimated) are in general compared
to measured variables other than those used as independent variables x_i (for example, the
temperature T) in model 300, such as, for example, data measured and collected using sensor
104, to determine the presence or absence of an anomaly.
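
The comparison described above can be illustrated with a minimal sketch, assuming the P = c*p*T model. The function names, the value chosen for the constant c, and the sensor readings are hypothetical and are not taken from this description.

```python
def estimate_pressure(c: float, density: float, temperature: float) -> float:
    """Physical model P = c * p * T, with `density` standing in for p."""
    return c * density * temperature


def residual(estimated: float, measured: float) -> float:
    """Deviation of the separately measured value from the model estimate."""
    return measured - estimated


if __name__ == "__main__":
    # Hypothetical inputs: c = 287.0 J/(kg*K) (approximate gas constant of air),
    # density in kg/m^3 and temperature in K from the independent sensors.
    p_estimated = estimate_pressure(c=287.0, density=1.2, temperature=300.0)
    # Hypothetical pressure reading (Pa) from a sensor not used as a model input.
    p_measured = 104_000.0
    print("estimated:", p_estimated)
    print("residual:", residual(p_estimated, p_measured))
```

A residual near zero suggests agreement between the model and the separate measurement; how large a residual must be before it indicates an anomaly is a design choice that this sketch leaves open.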

FIGs. 4 and 5 are flow charts illustrating steps in a failure detection method according
to the present invention. Specifically, FIG. 4 illustrates steps in the selection of a model and
the independent variables x_i for use in the model according to the present invention. In step
400, physical model 300 is developed or selected for use in fault detection system 100.
Model 300 is a physical model that is preferably based at least in part on first principles of
physics, such as, for example, the models F = m*a or P = c*p*T as described above. Model
300 also preferably includes a model update scheme, which can be accomplished through the
use of neural networks or other data-driven correction approaches. Model 300 may be
represented generally as y_estimated = f(x_i) * η(x_i, t), where f(x_i) is the primary component of the
physical model and η(x_i, t) is a data-driven correction factor, which may be implemented, for
example, as a correction factor table having data that is updated with time. The use of the
correction factor η(x_i, t) reduces the need to know completely how physical system 102
works. As part of the model update scheme mentioned above, η(x_i, t) can be represented in a
data table that is updated periodically using calibration results, test or inspection results, or
other more accurate or complete models of physical system 102.
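
One possible way to picture a primary model multiplied by a table-based correction factor is sketched below. It assumes a single input variable, a step-wise lookup, and an update rule that overwrites a table entry with the ratio of a calibration measurement to the primary estimate. The class name, method names, and update rule are illustrative assumptions, not the scheme prescribed by this description.

```python
from bisect import bisect_right


class CorrectionTable:
    """Hypothetical lookup table for a data-driven correction factor."""

    def __init__(self, breakpoints, factors):
        # breakpoints: ascending values of the input variable
        # factors: one correction factor per breakpoint
        if not breakpoints or len(breakpoints) != len(factors):
            raise ValueError("breakpoints and factors must be non-empty and equal length")
        self.breakpoints = list(breakpoints)
        self.factors = list(factors)

    def _index(self, x):
        # Entry for the highest breakpoint at or below x, clamped to the first entry.
        return max(0, bisect_right(self.breakpoints, x) - 1)

    def lookup(self, x):
        return self.factors[self._index(x)]

    def update(self, x, measured, primary_estimate):
        # Assumed update rule: replace the entry covering x with the ratio
        # observed at a calibration point (this supplies the time dependence).
        self.factors[self._index(x)] = measured / primary_estimate


def estimate(primary_model, table, x):
    """y_estimated = f(x) * correction factor looked up for x."""
    return primary_model(x) * table.lookup(x)


if __name__ == "__main__":
    table = CorrectionTable([0.0, 10.0], [1.0, 1.1])
    f = lambda x: 2.0 * x  # stand-in for the primary physical model
    print(estimate(f, table, 5.0))
    table.update(5.0, measured=12.0, primary_estimate=f(5.0))
    print(estimate(f, table, 5.0))
```

A real table would likely be indexed by several input variables and interpolated between entries; the single-variable step lookup is used here only to keep the sketch short.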

Model 300 may be selected from models already developed by the manufacturer or
other testing entity of physical system 102, or model 300 may be developed using first
principles of physics appropriate for system 102. Model 300 may be a simplified physical
model because the data-driven correction factors reduce the need for sophistication. It is
preferred that selected model 300 be an adaptive physical model such that the parameters in
the model change with time to adapt to changing system conditions or other factors so that
model 300 is more closely matched to the current state of physical system 102.

In step 402, the actual measured variables associated with physical system 102 are
identified. These variables generally include some control variables, which set the operating
conditions of physical system 102. As an example, the actual measured variables may include
pressure (P) and temperature (T). These variables generally correspond to those conditions
that are measured by sensors 104-108 of FIG. 1. This set of actual measured variables will
include both variables that will later be selected as independent variables x_i and variables that
will be used as actual output variables y_actual for comparison with variables y_estimated.

According to the present invention, in step 404, a subset of hardware redundant
measured variables is identified from the set of actual measured variables determined in step
402. These hardware redundant measured variables correspond to those variables that are
measured using two or more sensors. For example, referring to FIG. 1, sensors 106 and 108
are illustrated as sensing the same condition or variable from physical system 102. Thus, this
variable would be classified as hardware redundant. All or a portion of the selected set of
hardware redundant variables, as determined by the specific modeling needs of physical
system 102 and as described further below, will be used as independent input variables in
model 300. The use of hardware redundant variables is advantageous because measuring a
variable with two or more sensors significantly increases measurement reliability.

In step 406, the number of hardware redundant measured variables is compared to the
degrees of freedom of physical system 102. The degrees of freedom generally determine the
number of independent input variables x_i needed for modeling physical system 102. If the
size of the subset of redundant variables is equal to the number of independent variables
needed in model 300, then in step 412 the subset is used as independent variables x_i. If
there is an insufficient number of redundant variables, then in step 414 additional sensors are
added to physical system 102 until the number of independent variables at least equals the
degrees of freedom.

If the size of the subset of redundant variables is greater than the number of
independent variables needed in model 300, then in step 408 the entire set of redundant
variables is ranked by the reliability of the measurement. This reliability may be determined
as the confidence of obtaining an accurate measurement from the existing or selected sensors
for a given variable. In step 410, after the redundant variables have been ranked, a
subset of the redundant measured variables is created by selecting the required number of
most reliable redundant variables to be used as independent variables x_i.
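
The selection logic of steps 404 through 414 can be sketched as follows. The function name, the dictionary-based inputs, the numeric reliability scores, and the variable name T3 are assumptions made for illustration; this description does not specify how reliability is quantified.

```python
def select_independent_variables(sensor_counts, reliability, degrees_of_freedom):
    """Sketch of steps 404-414.

    sensor_counts: variable name -> number of sensors measuring that variable
    reliability:   variable name -> hypothetical reliability score (higher is better)
    Returns (selected variable names, number of variables still missing).
    """
    # Step 404: hardware redundant variables are measured by two or more sensors.
    redundant = [name for name, count in sensor_counts.items() if count >= 2]

    # Step 406: compare the number of redundant variables to the degrees of freedom.
    if len(redundant) < degrees_of_freedom:
        # Step 414: additional sensors are needed; report the shortfall.
        return redundant, degrees_of_freedom - len(redundant)
    if len(redundant) == degrees_of_freedom:
        # Step 412: use the redundant subset directly.
        return redundant, 0

    # Step 408: rank by reliability. Step 410: keep the most reliable ones.
    ranked = sorted(redundant, key=lambda name: reliability[name], reverse=True)
    return ranked[:degrees_of_freedom], 0


if __name__ == "__main__":
    counts = {"P2": 2, "T2": 2, "N1": 2, "N2": 2, "P3": 1, "T3": 3}
    scores = {"P2": 0.90, "T2": 0.80, "N1": 0.99, "N2": 0.98, "T3": 0.50}
    print(select_independent_variables(counts, scores, degrees_of_freedom=4))
```

Returning the shortfall rather than raising an error keeps the sketch neutral about how additional sensors are chosen in step 414.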

FIG. 5 illustrates steps in the formulation (or casting) of the selected model in a form
for use according to the method of the present invention. Specifically, following step 410 or
414 as is applicable, in step 500 the model 300 selected in step 400 is formulated to use only
the variables x_i selected as discussed above for FIG. 4 as independent variables in model 300.
The dependent variables y_estimated will be calculated using variables x_i.

In step 502, expected output variables y_estimated are determined using model 300 as
formulated in step 500. Computer system 110 receives redundant measured variable inputs
from sensors 106 and 108 or additional measured variables which may have superior
measurement reliability (such as from sensor 105). Computer system 110 executes a
software program that uses model 300 to calculate variables y_estimated. Computer system 110
also receives other actual measured variables, for example from sensor 104, that correspond to
measured output variables y_actual that will be compared to variables y_estimated. Model 300 can
also be expanded to include derived variables or synthesized variables, which are internal
variables of physical system 102 not measured directly by sensors 104-108.

In step 504, computer system 110 compares variables y_estimated to the actual measured
output variables y_actual to calculate residuals for each dependent variable modeled by model
300. These residuals represent the deviations or differences between the estimated and
measured variables. If derived variables are included in model 300, such comparison or
residual generation is either not performed for such derived variables or is performed between
the derived variables and the estimated variables based on other sources of information or
knowledge about physical system 102.

In step 506, the software program executing on computer system 110 analyzes the
residuals to detect the presence of an anomaly. Conventional residual analysis techniques
may be used to perform this analysis. Such techniques include, for example, thresholding and
classification. Thresholding is preferably done first and involves determining whether each
residual is greater than a predetermined threshold limit. If this limit is exceeded, then the
output variable corresponding to that residual is considered to be anomalous. Accordingly,
thresholding can be used to determine individual signal anomalies.
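
Residual generation (step 504) followed by thresholding (step 506) might look like the sketch below. The function names and numeric values are hypothetical. Comparing the absolute value of each residual against its limit is an assumption, since the description says only that the residual is compared to a predetermined threshold limit. Derived variables with no measured counterpart are simply skipped, which corresponds to one of the two options described above for derived variables.

```python
def compute_residuals(estimated, measured):
    """Residual = measured - estimated for each variable that has a measurement."""
    return {
        name: measured[name] - value
        for name, value in estimated.items()
        if name in measured  # derived variables without a sensor are skipped
    }


def threshold_anomalies(residuals, limits):
    """Names of variables whose residual magnitude exceeds its threshold limit."""
    return sorted(name for name, r in residuals.items() if abs(r) > limits[name])


if __name__ == "__main__":
    estimated = {"P3": 100.0, "T4": 1500.0}  # T4 is a derived variable
    measured = {"P3": 112.0}                 # no sensor measures T4 directly
    residuals = compute_residuals(estimated, measured)
    print(residuals)
    print(threshold_anomalies(residuals, {"P3": 5.0}))
```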

Classification involves an examination of the pattern of some or all of the residuals.
Classification is typically used to detect an anomalous operating condition when thresholding
fails to detect an individual signal anomaly, for example when all residuals are within their
respective threshold limits. Classification may detect a system anomaly when the residual
pattern indicates a new class or known failure mode. It should be noted that classification
generally detects only a system or a functional anomaly, and not an individual signal
anomaly.
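
The description does not name a particular classifier. As one illustrative stand-in, the sketch below assigns a residual pattern to the nearest stored reference pattern by Euclidean distance and reports a new class when no reference is close enough. The reference patterns, labels, and distance cutoff are all hypothetical.

```python
import math


def classify_residual_pattern(pattern, known_classes, max_distance):
    """Nearest-reference classification of a residual vector.

    pattern:       sequence of residuals, one per modeled output variable
    known_classes: label -> reference residual pattern of the same length
    max_distance:  hypothetical cutoff beyond which the pattern is a new class
    """
    best_label, best_distance = None, math.inf
    for label, reference in known_classes.items():
        distance = math.dist(pattern, reference)  # Euclidean distance
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label if best_distance <= max_distance else "new class"


if __name__ == "__main__":
    known = {"normal": (0.0, 0.0), "known fault A": (3.0, -2.0)}
    print(classify_residual_pattern((0.1, -0.1), known, max_distance=1.0))
    print(classify_residual_pattern((-4.0, 4.0), known, max_distance=1.0))
```

A pattern labeled as a new class or as a known fault would correspond to a system anomaly in the sense described above, even though every individual residual may lie within its own threshold limit.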

FIG. 6 is a table illustrating an example of actual and derived variables for the case
where physical system 102 is a gas-turbine engine system. Actual measured variables are
listed along with the physical condition or variable of the engine system to which the actual
measured variable corresponds. The actual measured variables are measured, for example,
using sensors 104-108. An example of a derived variable for the engine system is also shown
with its corresponding physical condition.

In an engine system, typical independent variables that may be used are P2, T2, N1,
and N2. These variables should be either hardware redundant or more reliable than other
measurements as discussed above. An example of an output variable is P3. Model 300 may
model P3_estimated as a function of P2, T2, N1, and N2, or simply set forth as
P3_estimated = f1(P2, T2, N1, N2). As discussed above, P3_estimated is compared to the actual
measured value of P3 to calculate a residual value for further analysis. Derived variable T4
also may be modeled as a function of P2, T2, N1, and N2, or simply set forth as
T4_estimated = f2(P2, T2, N1, N2). Derived variable T4 is used in analysis as generally
discussed above for derived variables that may be included in model 300.

Advantages and Variations 

By the foregoing description, a novel and unobvious method and system for detecting
faults in a physical system has been disclosed. The fault detection system and method of the
present invention has the advantages of improved anomaly detection in part due to the use of
more robust and reliable inputs than prior approaches and in part due to the method of
formulating a physics-based model that provides improved system operating insights and the
capability to estimate certain operating variables of the physical system. In addition, less
expense and time is required to develop the model of the system and less knowledge is
required about the system than with prior approaches directed to fault detection in complex
physical systems.

Although specific embodiments have been described above, numerous modifications
and substitutions may be made thereto without departing from the spirit of the invention. For
example, the fault detection method and system according to the present invention may be
used with a wide variety of physical systems in addition to those described above. Further,
the present invention can be applied generally to fault detection and isolation, and is not
illustration rather than limitation.